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Abstract 

Resampling is a standard step in particle filters and more generally sequential Monte Carlo methods. 
We present an algorithm, called chopthin, for resampling weighted particles. In contrast to standard 
resampling methods the algorithm does not produce a set of equally weighted particles; instead it merely 
enforces an upper bound on the ratio between the weights. Simulation studies show that the chopthin 
algorithm consistently outperforms standard resampling methods. The algorithms chops up particles 
with large weight and thins out particles with low weight, hence its name. It implicitly guarantees 
a lower bound on the effective sample size. The algorithm can be implemented efficiently, making it 
practically useful. We show that the expected computational effort is linear in the number of particles. 
Implementations for C++, R (on CRAN), Python and Matlab are available. 
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1 Introduction 


Particle filters and more generally sequential Monte Carlo methods have gained importance and widespread 
use Doucet et al. (2001). One of their key steps is resampling, which is intended to prevent weight degeneracy. 
Broadly speaking, resampling starts with a set of particles xi,... ,Xn with associated weights wi,... ,Wn and 
produces a new set of particles (a subset of the original set with potentially duplicates) with less uneven 
weights (often equal weights). 

A commonly used resampling algorithm is multinomial sampling, which selects a new set of particles 
by sampling n times with replacement from Xi,... ^Xn with probabilities proportional to ici,..., w„. Other 


resampling schemes have been proposed, for example systematic resampling (Whitley 
19991, stratified resampling ( Kitagawa] [1996 ), residual resampling Liu and Chen 


1994 


Carpenter et al. 


(1998) and branching 


2009, p. 278). All of these algorithms return a set of particles with equal 


resampling (Bain and Crisan 
weights. 

The general consensus seems to be that, whilst it is possible to outperform multinomial resampling, the 
more advanced methods such as residual, stratified and systematic resampling are comparable in terms of 


their performance in particle filters (Done and Cappe, 2005 Hoi et al. 2006). 


In this article we show that it is possible to improve the performance of the resampling step significantly. 
We do this by presenting a new resampling method that consistently outperforms the aforementioned meth¬ 
ods. 

The new algorithm, called chopthin, ensures that the weights are not too uneven by enforcing an upper 
bound, ry, on the ratio between the resulting weight. Chopthin can outperform other methods because it 
does not return particles with equal weights. 

The chopthin algorithm enforces the upper bound, 77 , on the ratio between the weights, as follows: 
Particles with large weights, above a threshold a, are potentially “chopped”, i.e. replicated with the original 
weight spread among the replicates. Particles with small weights, below the threshold a, are “thinned” by 
randomly deciding whether they should be deleted or kept, adjusting the weights by the selection probability 
to ensure unbiasedness. A similar approach to the thinning part of chopthin is used in |Fearnhead and Clifford] 
(2003) where the optimality of such a resampling method is shown in a certain sense. 


Particle filters often only perform the resampling step if a criterion of the unevenness of the weights, such 
as the effective sample size (ESS), drops below a fixed threshold. This avoids resampling if the weights are 
relatively even and thus reduces the noise being introduced through the resampling. This results in measures 
of the evenness of the particles such as the ESS to fluctuate over time. 
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In contrast to this, chopthin can be executed at every step of a particle filter. This is because chopthin 
evens out the weights less than existing schemes. It will not alter the weights much (or at all) if they are 
already relatively even. Using it at every step leads to less fluctuation in the unevenness of the weights over 
time. Figure]^ (later in the paper) illustrates this in an example by looking at the ESS over time. 

Chopthin can be implemented efficiently. Indeed, we present one version of chopthin, which can be 
implemented in expected constant linear effort in the number of particles. 

We begin by presenting the generic chopthin algorithm in SectionIn Section]^ we present a version of 
the algorithm that has expected linear effort and show in a simulation that its effort is comparable to other 
standard resampling methods. Simulation studies are conducted in Section that compares the chopthin 
algorithm to other resampling schemes within a particle hlter. The results show that our new algorithm 
consistently outperforms the other resampling methods. In Section we prove that the algorithm implicitly 
controls the ESS. 

Implementations of chopthin are available: as an R-package (chopthin on GRAN), as a python package 
(on the python package index), as C++ code and as a Matlab extension file (homepage of the first author). 


2 The Generic Algorithm 


Before introducing the chopthin algorithm, we first present the constraints that it satisfies. Denote the n 
particle weights before resampling as ici,..., Wn and let Q be the cr-field generated by u>i,..., Wn- Further, 
denote the N weights after resampling as wi,... ,wn- Let C* be the number of replicates of particle i. We 
want chopthin to satisfy the following: 


(i) EiC^w,\g) = w,Wi 

n 

(ii) Y.c^ = n 


2 = 1 


N 


(hi) = 


(Unbiasedness) 
(Target count) 

(Conserve weight) 


(iv) ^ < r? Vi, j 

Wj 


(Bounded ratio) 


Property 0 is an unbiasedness condition ensuring the expected total weight of the offspring of a particle 
is equal to its original weight. Property (|^ ensures that exactly N particles are returned after chopthin. 
Typically, N = n i.e. the number of particles is conserved. Properties (p]) and §) are satisfied by other 


resampling methods (|Douc and Cappe| |2005| . Property (|m|) ensures that the total sum of the weights before 

and 


and after resampling are equal. Property ( |i[) a nd ( |iii[ ) ensure that any estimator based on the normalised 
weights will be unbiased. Finally, property ( |iv[ ) bounds the ratio of weights returned from chopthin. 

Algorithm is a generic version of chopthin. As input it receives the weights {wi)i:n of n particles, rj, 
the desired upper bound on the ratio between weights, and N, the number of particles to be returned. 

Every particle gets a (potentially) random number of descendants. For a particle with weight w, the 
expected number of offspring from chopthin will be K^iw), where : [0,oo) —> [0,oo) is a given function 
which may depend on rj and on a further threshold parameter a. To ensure that N particles are returned 
(in expectation), we need to hnd a such that 


Y,Kiw,) = N. ( 1 ) 

i=l 

The mechanism that generates the descendants depends on the weight of the particle as well as on the 
parameter ry, which is specihed by the user, and the parameter a, which is determined by the algorithm. 
The key steps of Algorithm are: 

Find a (Step[^: The parameter a will serve as a threshold parameter that determines which particles 
are “thinned” and which are “chopped”. 
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Algorithm 1: Generic chopthin 


Input: particle weights {'Wi)i-,n\ maximal weight ratio ry; target number of particles iV; function 
K ■■ [0,oo) -)► [0,oo) 

Output: ancestors I S n}^, weights w G [0, oo)^ 

1 Let a be a solution to 

Let L = {j : Wj < a} and U = {j : Wj > a} 

Let 1 = 0 and w = 0 

2 Draw u ~ [7(0,1) // Thin 

for i G L do 

u = u + h1{wi) 

if u > 1 then 

append (i) to I and (a) to w 
u = u — 1 


3 Let Nl = length(/), Nu = N - N^- 
and C := 

4 Systematic[{{/i2(wj)}; j G U} ;Nu]; returns for j G U. 

for i G U do 

C = [K{Wi)\ + TO* 

append (i,..., i) G to / 

Wi=Wi + C{h2{wi)} 

append (^,..., ^) G to w 

return I, w 


// Adjusted weight 
// Chop 


Thin (Step[^: Particles with weights below a get “thinned”, i.e. either have 1 offspring (with weight 
a) or 0 offspring. 

Chop (Step|^: Particles with weights above a get “chopped”, which means that they get subdivided 
into smaller pieces, dividing the total original weight. 

The chopthin algorithm returns a vector of resampled weights {wi)i:N and an integer vector, /, containing 
the indices of resampled components of the original weights. Chopthin will return weights between a and 
rja. This way the bound on the ratio of the weights, property ( p^ , will be satisfied. 

We now discuss Algorithm in detail. In Step the threshold parameter a is found by solving Q. This 
depends on the choice of function h^. Choosing and solving Q are discussed toward to end of this section 
and in Section [S) 

The thinning step (Step[^ determines the new weight and the number of offspring for particles with small 
weights, Wi < a. Descending particles will have weight a, thus ensuring the range condition on the weights. 
The unbiasedness property 0 requires h^{w) = wja^ uniquely determining in this range. The number of 
offspring is determined by systematic resampling on {h2{wi) : Wi < a}, ensuring 0 or 1 descendants. 

Step returns Nl particles such that E(Ai|^) = The total weight of the surviving 

thinned particles is aN^. Thus, through the thinning step, the total sum of the weights may have changed. 
We compensate for this using C (step in the chopping step, thus ensuring property ( pli| 

The chopping step (Step[^ determines how the large weights, Wi > a, are subdivided. Each large weight 
will receive c* = \ hJl{wi)\ +to* offspring. The to* are determined by a second systematic resampling step on 
the fractional parts where {x} := x — [ccj. Performing systematic resampling on these fractional 

parts ensures E(C*|t/) = This holds because the expected value of to* is (C/o + l){7ia(wi)} and 

E(C|f/) = 0. Further, this resampling step will return exactly Nu particles. The value of Nu is selected 
such that the total number of offspring produced from the entire algorithm is exactly N (step . Thus 
property © is satisfied. Before chopping, the original weight is first adjusted using The adjusted weight 
is Wi = Wi -\- C{^2('f^i)}- This adjustment ensures that the totals sum of the weights is conserved, property 
([iii| and that the chopped weights are unbiased 0. 
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Figure 1: Illustration of allowable chopping region, A, represented by gray area with black border. The 
black point represents a candidate point (w^h^^w)), the arrows represent movement from the point by the 
adjustment vector p(l,l/a). The dark gray lined areas represent regions where {w,h^{w)) cannot lie. All 
dashed lines have a gradient of 1/a. 


The restriction that the chopped weights are between a and -qa requires 

w w 

a < and -j —- < qa, vw > a 

m [c\ 

where w = w — C{^a('*^)} fti® adjusted original weight and c is the number of offspring. These constraints 
define an area, A, in the (adjusted) weight-count space where 


A := |(w,c) : 

This region is illustrated in Figure[^by the light grey area with black border for the case q = 4.5. This area 
is only valid if 77 > 2 . 

To show that the (w, c) used in Algorithm lies in A, we first write the adjusted weight and count as 

(:)=(.,<^) 

where p := C{^2(^)}- We shall refer to the vector p(l,l/a) as the adjustment vector as it adjusts the 
original weight, w, to the adjusted weight, w. 

The requirement that (w,c) S A leads to constraints on h'li'w). Beside choosing h^{w) such that 
(w,h^{w)) G A we also need to ensure that the adjusted weight {w,c) is also in A. Possible constraints 
ensuring this are 

K[w) < ^ - 1 

a 

and 

> — — m{q — 1) -I-1 (3) 

a 

for a{qm — 1) < w < qam, m = 1,2,.... The regions where {w, h2{w)) are not allowed are represented by 
the dark grey lined areas in Figure]^ An alternative choice for could be to choose such that {w)} = 0 

for all w so that p — 0 (see end of this section). In this case, it is sufficient that {w, h'l{w)) G A. 

Figurej^illustrates the systematic resampling used in the first for-loop in Algorithm[^ where h^iw) = wja 
denotes the expected number of offspring for a particle with current weight w < a. This depends on the 



4 


















u 

K{wi) 

KM 

KM) 

KM) 

KM) 



0 1 2 3 4 5 


Figure 2: Illustration of systematic resampling in Algorithm 



weight, w 

Figure 3: Expected number of offspring as a function of the weight. Dot-dashed: chopthin § ; gray 
dotted: step-chopthin Q. 


threshold a. All particles have h^iwi) < 1. Particle 1, 2, 4 and 5 each get one descendent and particle 3 
receives no descendant. 

We have considerable freedom in choosing for w > a. One natural choice would be 




w/a if w < a 

\w/{r]a)~\ if w > a 


( 4 ) 


illustrated in Figure]^ We call the resulting algorithm step-chopthin. The requirement that {w,h2{w)) G A 
implies \w/(riay\ < [w/aj for all w > a. Considering values of w slightly less than 2a implies 77 > 2. 

This choice does not guarantee the existence of a solution a of Q due to the discontinuities. Instead of 
having an exact solution, one could use an approximate solution, using a numerical root finding algorithm, 
but this would not guarantee that the desired number of particles is returned property (ii). 


3 Implementation in expected linear time 

In this section we present our main version of the algorithm, which we simply call chopthin. For this we 
choose such that it is continuous (in a) and such that 0 can be solved for a in expected linear effort. 
Consider the function 

{ w/a if w < a 

1 if a < w < 770/2 (5) 

2w/rja if 7 C > 770/2 

which is depicted in Figure]^ The requirement that {w,h2iw)) G A implies 2w/rja < [tc/oJ for all w > a. 
Considering values of w slightly less than 2a implies rj > A. 

We use Algorithmto solve ^ using ([^ for a. Lemmaproves that the expected effort 

of Algorithmis linear in ti, and overall the expected computational effort is 0 (max( 7 T,, iV)). 

Algorithm^ works by determining which weights are above or below a and which weights are above or 
below 77 a, without fully knowing a yet. Due to the piecewise linear structure of the contributions of 
weights for which this determination has been made can be easily kept track of by the number and the sum 
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Algorithm 2: Fast determination of a 


Input: particle weights Wi\ maximal weight ratio 77 ; target number of particles N 
Output: a > 0 such that = N 

= w, = 0, c™ = 0, = 0, c“ = 0 

while w'^ ^ % or ^ % do 

if |w*| > |w“| then sample a uniformly from and let & = 770/2 
else sample b uniformly from 777 “ and let a = 2b/rj 
h = s'-/a + J2v&w‘ niin(7;/a, 1) + c™ + max(f -1,0) + s“/6 - c" 

if h=N then return a 
ii h > N then 

s' = s' + 'Evgw^ ^ a) 
w' = {v € w']v > a}, 77;“ = {u G 77 i“; v > b} 
else 

C™ = c”" + Ev&w‘ I(^' ^ a). s“ = + 'Ev&w‘ ^ c“ = c“ + 'Zv&w‘ ^ 

w' = {v € w'-,v < a}, 77;“ = {u G 771“; 7; < 6 } 


1 s ^+ 2 s ^/?7 

return a = i^r -m / L 


Table 1: Example run of Algorithm with N = n = 5 and 77 = 4. 


Wi 

a 


Wu 

b 

h 

10.1,0.3,0.5, 0.9 ,1} 

0.9 

{0.1,0.3,0.5,0.9,1} 

1.8 

3 

10.1,0.3,0.5} 

0.15 

{0.1, 

0.3 

,0.5,0.9,1} 

0.3 

9.67 

{0.3,0.51 

0.25 

{ 

0.5 

,0.9,1} 

0.5 

6.2 

{ 0.3 ,0.5} 

0.3 


{0.9,1} 

0.6 

5.5 

{0.5} 

0.5 


{0.9, Hi 

1 

3.8 

0 

0 

r 

0.45 

andom 

{0.9} 

0 

iy chosen element 

0.9 

3.89 


of those particles (see s', c^, s“, c“ and the computation of h in Algorithm]^. The algorithm maintains 
two lists — w', the weights for which we do not know yet whether they are above or below a, and w'^, the 
weights for which we do not know yet whether they are above or below 770 / 2 . The exact value of a is only 
determined when h = N or when both w' and 771 “ are empty. At every iteration, a new candidate for a or 
b = 770/2 is selected from the longer of w' and 77;“. Depending on whether h > N or h < N the alrarithm 
then removes elements from w' and 77;“ and updates the counts/sums of decided weights. See Table for an 
illustrative run through of Algorithm for N = n = 5. 

Lemma 1. The expected effort of Algorithm^ is 0{n). The expected effort of Algorithm^ together with 
Algorithm^ is 0{max{n, N)). 

Proof. We use a subscript to denote iterations in Algorithm with 77 ;( = w = wf. The effort in the Ah 
iteration of the while-loop is proportional to the number of elements in tc* and wf. Thus the overall effort 
is proportional to 

00 

Consider iteration i. The following statements are conditional on the sets 77 ;)_;^, 77 ;“_^. Suppose that | 77 ;)_;^| > 
\w'i-i\- We show that E(| 77 ;(|) < ( 3 / 4 )| 77 ;(_^|. Let a be the randomly selected element from 77 i(_;^. Let a* be 
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Table 2: Effort of resampling N particles divided by the effort to generate N exponentially distributed 
random variables in R 


N 

1000 

10000 

lO'’ 

10 “ 

chopthin 

1.77 

1.53 

1.53 

1.64 

systematic 

0.43 

0.34 

0.35 

0.35 

multinomial (sample.int) 

0.88 

0.89 

1.02 

1.36 

multinomial (cond. Binomial) 

1.81 

1.90 

1.92 

2.03 


such that X]r=i Let = |{u G w\_i : v < a*}|, M“ = |{u G w\_i : v > a*}|. We then have 


E(k'l) = 

= ¥.(\w\ 


a < a 


M‘ 


= (x + 

(M' +M“)2 


m-i\ 


-E(\wI\ 

~Y 


a > a 




w. 


i-l\ 


■M‘ 




2\w: 


i-l I 




Hence, E(|u;'| + \wf\) < ||w'_i| + K_i| < |(|u;'_i| + K_i|) as 
that the above also holds if |w-_i| < 

Thus, 


Similarly, it can be seen 


E(k'l + KI)=E E 


7, 


Wi 


w. 




< -E(|u;Ul + l<-il) 

i-l 


< 


< 


e(Ki + ki) = 


i-l 


2n 


and therefore 


E 


Ed^ii + Ki) 


ii: 


i=l 


2n = 2n 


1 


1 - 7/8 


= 16n. 


This shows that the expected effort of Algorithm]^ is 0{n). The remainder of Algorithm [^entails generating 
the output of length N and it runs through all n particles with an overall effort of 0(max(n, N)). Thus the 
expected effort of the combined Algorithmsis 0(max(n, A^)). □ 

We now compare the effort of chopthin to the effort of sampling with replacement (multinomial resam¬ 
pling), via the in-built function sample. int in R and via a method using conditional Binomial distributions 
( Davis[ [1993 ) and a (fast) C++-based implementation of systematic resampling. 

We simulated N weights from an Exponential distribution, i.e. Wi ~ Exp(l), i = 1,..., A^ independently. 
We then applied the resampling procedures to the simulated weights. 

Table [^reports the mean effort of the resampling procedures over 10000 repetitions. The reported effort 
is relative to the effort to generate the weights (a call of the in-built R function rexp). Constant values 
indicate that the effort is linear in N, as the effort of generating the random variables is linear in N. 

Systematic, chopthin and multinomial resampling (the conditional Binomial implementations) are all 
approximately linear in N. As expected, chopthin is more computationally demanding than systematic 
resampling as part of the chopthin algorithm consists of systematic resampling steps. 

Nevertheless, the computational effort of chopthin is very moderate, only slightly more than generating 
exponentially distributed random variables. 
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Algorithm 3: Particle filter 

Input: target number of particles iV; ESS threshold j3] resampling scheme r; observations yi, ■ ■ ■ ,yT- 

Output: weighted particles (wi, for t = 1,..., T 

Sample ~ p(xo), i = 1, ■ ■ ■, JV 

Wi = 1, i = 1,... ,N 

Let Tig = N 

for t = 1 ,..., r do 

Sample ~ p{xt\x[^li), i = 1,... ,nt_i 

Wi = Wip{yt\xl'''*), i = 
if ESS(w) < P then 

Run r with a target of N particles to get a set of particles {wi,x^^'’)i-nt 
Normalise weights such that Wi = N 


4 Simulations 

We now compare the performance of chopthin to other resampling methods within a particle hlter. We also 
vary the bound of the ratio on the weights, rj, and illustrate that chopthin results in a less variable ESS. 


4.1 Linear Gaussian Model 


Consider a model with hidden Markov process Aj € K and observed process Yj G K for t G N. In this section, 
we are interested in the model 


A* = Xt-i + et, et “ A(0,1) 
Yt =Xt+^t, 


with Aq A(0,1) and known ay > 0. For this model the Kalman filter (Kalman 
conditional distribution, giving us a benchmark. 


1960) gives the exact 


4.1.1 Simulation 

We use the particle filter in Algorithm to give estimates of the hidden states Ai,...,Ar based on the 
observations yi,. ■ ■ ,yT- We select p(xt\xt-i) and p{yt\xt) as indicated by the linear Gaussian model. We 
are interested in the posterior Xt\yi,.. .yt for t = 1,..., T. Resampling is performed if the ESS drops below 
P G [0, N], The ESS of a weight vector, w = {wi,... ,Wn) is defined as 


ESS(w) 




It is often used in particle Liters to trigger the resampling step, li P = N then resampling is performed at 
every step as ESS(?u) < N. Lastly, potentially any resampling scheme r can be used in Algorithm]^ 

For a given ay, resampling scheme r, target number of particles N and resampling trigger /3, a single 
iteration of the simulation is conducted as follows: simulate from the model T = 1000 observations; yi,... ,yT- 
Using this realisation of observations, run the particle filter to give estimates of the hidden states Ai,..., Xy. 
Lastly, the Kalman filter is run to obtain the exact conditional distribution. We use M = 1000 iterations. 
The simulation is conducted using combinations of the parameters: cry, N, /3, t] (for chopthin) and various 
resampling schemes. 












multinomial, |3 = 0.5N 


systematic, |3 = 0.5N 


O 



. 1 


- number of particles 

□ ESS before resampling 
♦ ESS after resampling 
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chopthin, ri = 3+V8 



0 10 20 30 40 50 



0 10 20 30 40 50 


Figure 4: ESS before and after resampling for selected resampling schemes (one realisation). 


4.1.2 Illustration of One Run 

Figure considers the effect of different resampling schemes on the ESS during the first 50 steps of one 
realisation of the particle filter (Algorithm]^ with N = 10000 target particles. It plots the ESS before and 
after resampling. As resampling for the multinomial and systematic algorithm only occurs if the ESS has 
dropped below 0.5A^, the ESS is far more variable than in the chopthin algorithm. For both r] = 3 + -s/S 
and 77 = 10 , the chopthin algorithm after resampling stays significantly above its theoretical lower bound 
(given in Section]^, which is O.SA^ and 0.33N, respectively. Also the two choices of 77 within chopthin lead 
to similar behaviour. 


4.1.3 Results 


Tablej^shows the results of the full simulation for the following resamplers: chopthin, multinomial resampling 
(resampling with replacement), branching (Bain and Crisan, 2009 p. 278), stratified sampling, standard 


residual sampling (multinomial resampling of the residuals), residual sampling with stratified resampling of 
the residuals and systematic resampling. 

For each iteration, we obtain the estimated posterior mean of Xt for t = 1,... ,T. For a given ay, X and 
/3, denote the estimated posterior mean from iteration i, at time t, for resampling scheme r as Jli.t.r- Further, 
denotes the true posterior mean at time t given by the Kalman filter. We report the approximate mean 
squared error (MSE) for resampling scheme r as 


M r T 
2^1 I, 

The MSE values, presented in Table are divided by the MSE given by the systematic resampling. The 
results show that using the chopthin algorithm at every step (/3 = N) and using the trigger (/3 = 0.577) with 
various values for the ratio bound 77 consistently achieves a lower MSE than the other resampling methods. 
The simulations using ay = 1/3 is based on a setting where there is a small amount of noise between the 
state and observation. In this case, the particle filter will be resampling at nearly every step for all methods. 


9 









































Table 3: Simulations Results - linear Gaussian model: MSE values for various simulation parameters and 
different resampling methods. Presented MSE values are divided by the MSE from the simulations using 
systematic resampling. 



/3 

V 


N 

ay 

100 

1/3 

100 

1 

100 

3 

100 

9 

10 ^ 

1/3 

10 ^ 

1 

10 ^ 

3 

10 ^ 

9 

10 ^ 

1/3 

10 ^ 

1 

10 ^ 

3 

10 ^ 

9 

chopthin 

N 

4 



0.99 

0.90 

0.88 

0.91 

0.98 

0.90 

0.89 

0.92 

0.97 

0.91 

0.90 

0.94 

chopthin 

N 

3 + 

78 


0.97 

0.90 

0.86 

0.86 

1.00 

0.89 

0.86 

0.87 

0.95 

0.90 

0.89 

0.90 

chopthin 

N 

10 



0.97 

0.92 

0.87 

0.85 

0.98 

0.91 

0.87 

0.86 

0.94 

0.93 

0.86 

0.85 

chopthin 

0.57V 

3 + 

78 


0.98 

0.98 

0.96 

0.94 

0.97 

0.98 

0.96 

0.94 

0.96 

0.98 

0.96 

0.94 

multinomial 

0.57V 

- 



1.01 

1.05 

1.15 

1.21 

0.99 

1.04 

1.15 

1.24 

0.98 

1.04 

1.15 

1.22 

branching 

0.57V 

- 



1.01 

1.01 

0.99 

0.99 

1.00 

0.99 

1.00 

1.01 

0.95 

1.01 

1.00 

1.00 

residual 

0.57V 

- 



1.00 

1.00 

1.00 

0.99 

0.99 

1.00 

1.00 

1.01 

1.00 

1.00 

1.00 

1.01 

stratified 

0.57V 

- 



1.00 

1.00 

1.01 

1.02 

0.99 

1.01 

1.02 

1.04 

0.96 

1.02 

1.01 

1.01 

residual-stratified 

0.57V 

- 



1.00 

1.01 

1.01 

1.01 

0.97 

1.00 

1.01 

1.01 

0.96 

1.03 

1.00 

1.05 

systematic 

TV 

- 



1.00 

0.96 

1.06 

1.37 

1.01 

0.96 

1.11 

1.44 

0.98 

1.00 

1.13 

1.47 

systematic 

0.57V 

- 



1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 


Underline: below 0.9. 


Chopthin with /3 = 0.57V is included in these simulations to support our suggestion that chopthin should be 
used in every iteration of a particle filter. In similar simulations, not presented here, we compared the MSE 
of using the chopthin with P = N and systematic resampling with various values of /3. These simulations 
still showed that the chopthin method consistently outperforms systematic resampling. 

In general, chopthin appears to perform better than other resampling methods, particularly when ay is 
large. This may be due to a combination of factors. First, chopthin with (3 = N keeps the quality of the 
particle approximation more stable than methods using the ESS as resampling trigger (see Figure]^. Second, 
compared to using a standard resampling scheme at every iteration (/3 = N), chopthin leaves particles with 
weights between a and r]a/2 unchanged; only thinning the particles with weights less than a and chopping 
those above 770 / 2 . As a result, a better particle system seems to be maintained. 


4.1.4 Estimation of the Likelihood 

The likelihood of the observations p{yi, ■ ■ ■, yr) can be decomposed as 

T 

p(,yi, ■■■,yT) = Y[p{yt\yv.t-i)- 

t=i 


The conditional distribution p{yt\yi-.t-i) can be approximated from these simulations the average of the 
weights; that is 

1 ^ 

p{yt\yi:t-i) = 

^ k=l 


where the Wk are the weights after the conditioning on the observation yt. 

Unbiased estimation of the marginal likelihoods, p{yi-.t), is particularly important in particle MCMC 

, [ 201 ^ 


methods (e.g. Andrieu et al. 2010 Doucet et al. 


Sherlock et al. 20151 in order to preserve the 


correct invariant distribution. We conjecture that the chopthin algorithm provides an unbiased estimate of 


the marginal likelihood. A proof could be based on a decomposition similar to the one used in (Del Moral 
2004, Proposition 7.4.1). 


For the model, presented in Section [4~T| the exact marginal likelihood can be computed using the Kalman 
filter, providing a comparison with the estimates given by the particle filter. For the same run of the 
simulation conducted in Section 4.1.3 we estimate the conditional likelihood as follows. Let y\.^ denote the 
observations simulated in iteration i for t = 1,... ,T. Then denote the estimate of p{yt\y\-t-i) foe iteration 
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Table 4: Simulation Results - linear Gaussian model: MSE values of log likelihood for various simulation 
parameters and different resampling methods. Presented MSE values are divided by the MSE from the 
simulations using systematic resampling. 



/3 

1 

N 

ay 

100 

1/3 

100 

1 

100 

3 

100 

9 

10 ^ 

1/3 

10 ^ 

1 

10 ^ 

3 

10 ^ 

9 

10 ^ 

1/3 

10 ^ 

1 

10 ^ 

3 

10 ^ 

9 

chopthin 

N 

3-f v/8 


0.92 

0.88 

0.85 

0.86 

1.07 

0.88 

0.85 

0.87 

0.89 

0.91 

0.89 

0.90 

multinomial 

0.5A 

- 


1.03 

1.07 

1.17 

1.22 

0.94 

1.07 

1.17 

1.24 

1.09 

1.06 

1.17 

1.23 

branching 

0.5A 

- 


1.02 

1.00 

0.99 

0.99 

0.98 

0.99 

1.00 

1.01 

1.03 

1.02 

1.00 

1.01 

residual 

0.5A 

- 


1.00 

1.00 

1.00 

1.00 

1.01 

0.99 

1.00 

1.01 

1.00 

1.00 

1.00 

1.02 

stratified 

0.5A 

- 


1.03 

0.99 

1.02 

1.03 

0.95 

1.01 

1.02 

1.05 

0.91 

1.03 

1.02 

1.01 

systematic 

N 

- 


0.99 

0.95 

1.06 

1.37 

1.05 

0.93 

1.10 

1.45 

0.95 

0.99 

1.11 

1.46 

systematic 

o.sv 

- 


1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 

1.00 


i, for a given resampling method, ay, N and /3 as piutlui-.t-i)- In Tablewe report the following MSE 

r M T 

(logp(?/J|yut-i) - logp(2/J|2/ut-i))^ 

t i=i ^ t=i 

Based on the MSE results, the chopthin method approximates the log likelihood better than systematic 
and consistently for other resampling methods. 


4.2 Stochastic Volatility Model 

We now consider a more complicated model; a stochastic volatility model with hidden process G N and 
Yt G K for t G N defined by: 

iXt =0.9W_i+0.25et, e*iV(0,1) 

=0.16exp(W/2), 6 ~IV(0 ,1) 

with Xq ~ iV(0,1). Unlike the linear Gaussian model, the posterior distributions are not available in closed 
form. As a benchmark we approximate these distributions using a numerical approach that discretises the 
hidden state space into a fine grid. 

4.2.1 Results 

We repeat the same simulation described in Section [4. 1.1 [ for the stochastic volatility model, altering Algo¬ 
rithm 1^ accordingly. The MSE of the posterior mean and loglikelihood is presented in Table The results 
again show that using chopthin every iteration outperforms the other resampling method. 

The results in Tables and m show that for a fixed number of particles, chopthin outperforms other 
resamplers in terms of MSE. However, as illustrated in Table using chopthin is computationally more 
expensive than systematic resampling. Therefore, use of chopthin should be favoured when the computational 
expense of the other steps in the particle filter, i.e. the transition of the particle values and computational 
of the weights, exceed the expense of resampling. In scenarios where the transition or weight computation 
are cheap, using systematic resampling may be preferred. 


11 









Table 5: Simulations Results - Stochastic volatility model: MSE values for different resampling methods. 
Presented MSE values are divided by the MSE from the simulations using systematic resampling. Left table: 
MSE of the posterior mean, right table: MSE of the loglikelihood. 



P 

V 

100 

10 ^ 

10 ^ 

chopthin 

N 

4 

0.82 

0.82 

0.85 

chopthin 

N 

3 + 

0.84 

0.83 

0.87 

chopthin 

N 

10 

0.89 

0.89 

0.90 

chopthin 

0.5N 

3 + 

1.05 

1.04 

1.01 

multinomial 

0.5N 

- 

1.09 

1.10 

1.05 

branching 

0.5N 

- 

1.00 

1.00 

0.99 

residual 

0.5N 

- 

1.01 

1.00 

1.00 

stratified 

0.5N 

- 

1.00 

1.00 

0.99 

systematic 

N 

- 

1.00 

1.00 

0.98 

systematic 

0.5N 

- 

1.00 

1.00 

1.00 



P 

V 

100 

10 ^ 

10 ^ 

chopthin 

N 

3 + 78 

0.85 

0.83 

0.88 

multinomial 

0.5N 

- 

1.08 

1.09 

1.05 

branching 

0.5N 

- 

0.97 

1.00 

0.99 

residual 

0.5N 

- 

0.99 

1.01 

0.97 

stratified 

0.5N 

- 

1.00 

1.00 

0.98 

systematic 

0.5N 

- 

1.00 

1.00 

1.00 


5 Implied control of the Effective Sample Size 


The following lemma shows that imposing a bound on the ratio between the weights implicitly results in a 
lower bound on the ESS. It implies that chopthin has a lower bound on the ESS after resampling. 


Lemma 2. Suppose Wi,..., Wn > 0. Then 


ESS(w) 




where rj 


maxj Wj 
mini Wi 


Proof. In the case where all weights are equal, i.e. rj = 1, then ESS(rc) = n, thus inequality holds. From now 
on consider the case rj > 1. 

Let Wi = normalized weights corresponding to W. Then ESS(r(;) = ESS(IE). The set 

of possible normalized weights is compact and ESS is a continuous function, thus there exists a W* that 
minimises ESS. Without loss of generality, assume Wf < ■ ■ ■ < W*. 

The normalised weight W* has to be of the form W* = a for i < fc, Wf = ar, W* = rja for i < k, 
where k € {l,...,n — 1}, a > 0 and 1 < t < p. To see this let IE be a normalised weight vector for which 
there exist mutually distinct indices i,j, k, I such that Wi < Wj < Wk < Wi. Define a new weight vector V 
identical to W except for Vj = Wj — A, = Wk + A with A = min((W; — Wk)l2, {Wj — Wi)/2). Then 


1/ESS(E)=^K? 

= 2A2 + 2A(IEfc - Wj) + ^wl> I/ESS(IE) 

U 


which shows that W does not minimise ESS. Hence, W* can take at most 3 values, the middle one, if 
present, appearing exactly once. The two extreme values have to have a ratio of p, otherwise one could move 
them further apart and create a weight vector with smaller ESS. 

As — 1 + r + p{n — k)], we have 


ESS(IE*) 


[k — 1 + T + p{n — k)]'^ 
fc — I + + (n — k)p‘^ 

^ [k + p{n — k)]"^ 

~ k — l + {n — k + 1)772 


> inf h{x) 
1] 


wbprp hfoi — l^+v{-n-x)f 

wnere n(x) - ■ 

It remains to derive the minimum of h. Candidates for minimizers of h are x = pn/{p — 1) (which is not 
in the right range) and x = {p{n + 2) + 2)/{p + 1). Plugging this into h gives h{x) > 4□ 
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Larger 77 allow for more variability in the weights and thus should lead to lower effective sample sizes. 
Consistent with this, the lower bound on ESS is decreasing in rj. This can be seen by differentiating it with 
respect to r]. 

For large n, the leading term is 4 ■ Equating this to a desired minimal effective sample size yn gives 


V = 


2 - 7 + 2^1 - 7 


For example, for 7 = 0.5, this leads to 7 = 3 + -x/S. Furthermore, for 7 = 10, the lower bound on the ESS is 


^ « 0.33n. 


6 Discussion 

6.1 Why not only impose an upper or a lower threshold on the weights? 

The chopthin algorithm imposes a bound on the ratio of the largest and the smallest weight. Alternatively, 
one could have imposed only a lower or only an upper bound on the normalized weights. The following 
examples illustrate that there are situations in which these bounds would not lead to resampling despite 
very uneven weights. The chopthin algorithm (with rj < n) would even out the weights in both examples. 

Example 1. Suppose our weight veetorw of length n is produced by one importance sampling step, where the 
target distribution is a uniform distribution on [0,0.5] and the importance sampling distribution is a uniform 
distribution on [0,1]. Then roughly half of the weights will be approximately 2/n and half of the weight will 
be 0. None of the weights is large, so imposing an upper bound on the weights would not lead to resampling. 

Example 2. Consider the same setting as in the previous example, but now having as target distribution a 
mixture of two egually probably components: a uniform distribution on [0,1] and a uniform distribution on 
[0,1/u]. Suppose the first particle is in [0,1/n] and all other samples are greater than 1/n. Then the weight 
of the first particle is {n + l)/2 and the weight of all other particles is 1/2. Thus in this case, no weight is 
small, so imposing a lower bound on the weights would not lead to resampling. 


7 Summary 


In this paper we have introduced the chopthin algorithm which bounds the ratio between the weights. We 
showed, in simulations, that chopthin consistently outperforms standard resampling schemes used in particle 
filters. The simulations also demonstrated that chopthin can be used at every iteration in a particle filter 
with no detrimental effects. The chopthin algorithm can be implemented efficiently and we have proved 
that its expected effort is linear in the number of samples. Lastly, we have shown that imposing a bound 
on the ratio between weights implicitly controls the ESS. As mentioned in Section]^ use of chopthin within 
particle filters over other, less computational expensive, resamplers should be favoured when the expense of 
resampling is negligible in comparison to the other steps in the particle filter. 

Proving a central limit type theorem of the particle filter estimates using chopthin resampling is a natural 
next step. However, as the chopthin algorithm uses systematic resampling this will not be straightforward 
(Gentil and Remillard 2008 1 . Replacing systematic resampling with a resampling method more amenable 


to theoretical developments could be a topic for future research. 


13 









References 


Andrieu, C., A. Doucet, and R. Holenstein (2010). Particle Markov chain Monte Carlo methods. Journal of 
the Royal Statistical Society: Series B 72(3), 269-342. 

Bain, A. and D. Crisan (2009). Fundamentals of Stochastic Filtering. Springer. 

Carpenter, J., P. Clifford, and P. Fearnhead (1999). Improved particle filter for nonlinear problems. Radar, 
Sonar and Navigation, IFF Proceedings 146(1), 2-7. 

Davis, C. S. (1993). The computer generation of multinomial random variates. Computational Statistics & 
Data Analysis 16(2), 205-217. 

Del Moral, P. (2004). Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applica¬ 
tions. Probability and Its Applications. Springer. 

Done, R. and O. Cappe (2005). Comparison of resampling schemes for particle filtering. In Proceedings of 
the 4th International Symposium on Image and Signal Processing and Analysis, pp. 64-69. IEEE. 

Doucet, A., N. de Ereitas, and N. Gordon (Eds.) (2001). Sequential Monte Carlo Methods in Practice. 
Springer. 

Doucet, A., M. K. Pitt, G. Deligiannidis, and R. Kohn (2015). Efficient implementation of Markov chain 
Monte Carlo when using an unbiased likelihood estimator. Biometrika 102(2), 295-313. 

Eearnhead, P. and P. Clifford (2003). On-line inference for hidden Markov models via particle filters. Journal 
of the Royal Statistical Society: Series B 65(4:), 887-899. 

Gentil, I. and B. Remillard (2008, 06). Using systematic sampling selection for Monte Carlo solutions of 
Eeynman-Kac equations. Advances in Applied Probability 40(2), 454-472. 

Hoi, J. D., T. B. Schon, and F. Gustafsson (2006). On resampling algorithms for particle filters. In Nonlinear 
Statistical Signal Processing Workshop, 2006 IFFF, pp. 79-82. 

Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Journal of basic engi¬ 
neering 82(1), 35-45. 

Kitagawa, G. (1996). Monte Carlo filter and smoother for non-gaussian nonlinear state space models. Journal 
of Computational and Graphical Statistics 5(1), 1-25. 

Liu, J. S. and R. Chen (1998). Sequential Monte Carlo methods for dynamic systems. Journal of the 
American Statistical Association 93(443), 1032-1044. 

Sherlock, C., A. H. Thiery, G. O. Roberts, and J. S. Rosenthal (2015). On the efficiency of pseudo-marginal 
random walk Metropolis algorithms. The Annals of Statistics 43(1), 238-275. 

Whitley, D. (1994). A genetic algorithm tutorial. Statistics and Computing ^(2), 65-85. 


14 



